Abstract

This paper proposes an approach to implementing optimal control laws for nonlinear systems in real time. Our
methodology does not require solving two-point boundary value problems online and may not require it offline either.
The optimal control law is learned by the original Sugeno controller (OSC) from a family of optimal trajectories.
We compare the trajectories generated by the OSC with the trajectories yielded by the optimal feedback control law
when applied to Zermelo’s ship steering problem.


1. Introduction 

Optimal control [Bryson, 1996; Kirk, 1970] is one of the oldest approaches to control engineering. It has many advantages:
(1) State and control constraints can be included explicitly. (2) The cost function to be minimized can often be given
a simple, intuitively appealing interpretation. (3) Optimal control is a very general methodology applicable to
multi-input-multi-output, nonlinear, stochastic, or infinite-dimensional systems. Hence, optimal control theory provides a
unified approach to stating and solving very general control problems that are at the same time physically intuitive.
Unfortunately, optimal control theory suffers from a major disadvantage: solving optimal control problems is in
general computationally difficult except in very special cases where a closed-form expression of the control law can be
obtained. These cases include many nonlinear second-order systems and the celebrated linear quadratic regulator. In
general, however, the necessary conditions have no closed-form solution, and solving them is at least as difficult as solving a
nonlinear two-point boundary value problem (for the control of a system described by deterministic nonlinear ordinary
differential equations). When the plant is stochastic or infinite-dimensional, the numerical difficulties are compounded.

The absence of simple closed-form solutions and of online numerical solutions of the general open-loop control
problem means that there is no general feedback implementation (except in the neighborhood of an optimal reference
trajectory, using the well-known neighboring optimal control [Bryson and Ho, 1975]). The lack of a feedback
implementation is, in our opinion, the main reason why interest and research in optimal control have greatly
diminished.

On the other hand, fuzzy-logic controllers (FLCs) are essentially feedback control laws. These controllers
can easily be made to incorporate the heuristic knowledge of the control engineer, which is an advantage in cases
where this is about the only knowledge available. However, designing an FLC using detailed, mathematical, and exact
descriptions of the plant is not very well understood or widely practiced.

Clearly, using all available knowledge about the system should in principle yield control laws with superior
performance. Hence, we investigate in this paper the possibility of designing fuzzy-logic controllers that approximate
optimal control laws; from another point of view, we investigate the feedback implementation of optimal control laws
using fuzzy-logic controllers.

To illustrate this approach, we consider Zermelo’s problem; that is, the problem of docking a ship moving at
constant speed relative to the water in minimum time in a region of strong water currents, using the heading angle as
the control input. We obtain a family of open-loop solutions of this problem and use it to train the OSC. The resulting
trained engine is then a feedback implementation of (a least-squares approximation of) Zermelo’s optimal control.
Sugeno controllers [Buckley, 1993] are capable of approximating any continuous map to within an arbitrary accuracy.

This paper is organized as follows. Section 2 provides the necessary background information on the optimal
control of the ship steering problem. Section 3 discusses the training procedure used in designing the Sugeno-type
controller from the data obtained from the optimal trajectories. Section 4 discusses the generation of training data and
the elimination of the angle discontinuity. Finally, Section 5 summarizes the procedure used and shows simulation results.


2. Zermelo’s Optimal Control Problem 

The objective of Zermelo’s problem is to find a minimum-time path through a region of position-dependent vector
velocity [Bryson and Ho, 1975]. In this problem, a ship must travel in minimum time through a region of strong
currents denoted by the two-component vector $v(x)$:

$$ v_1 = v_1(x_1, x_2) \quad \text{and} \quad v_2 = v_2(x_1, x_2) \tag{1} $$

where $(x_1, x_2)$ represent the position of the ship in rectangular coordinates, $(v_1, v_2)$ are the velocity components
of the current in the same coordinate system, and the control $u$ is the steering angle $\theta$, i.e. $u = \theta$. The
magnitude of the ship’s velocity relative to the water, $V$, is a constant. The problem is to steer the ship in such a way as
to minimize the time necessary to go from an arbitrary position to a specified docking position $x_f$.

The purpose of the generalized Sugeno controller is to approximate the steering angles needed to generate these 
minimum time paths as a function of x. 

The equations of motion are as follows: 


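$$ \dot{x} = \begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} v_1 + V\cos\theta \\ v_2 + V\sin\theta \end{bmatrix} = v(x) + V \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix} \tag{2} $$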


where $u = \theta$ is the heading angle of the ship’s axis relative to the coordinate axes and is the control signal. The
Hamiltonian of the system is:

$$ H = \lambda_1 (V\cos\theta + v_1) + \lambda_2 (V\sin\theta + v_2) + 1 \tag{3} $$

and the Euler-Lagrange equations are $\dot{\lambda}_1 = -\partial H/\partial x_1$, $\dot{\lambda}_2 = -\partial H/\partial x_2$,
and $0 = \partial H/\partial\theta$, whose solution is $\tan\theta = \lambda_2/\lambda_1$. The
optimal trajectories satisfy the boundary conditions that $x(t_0)$ and $x(t_f)$ are specified. Since the Hamiltonian is not an
explicit function of time, $H = \text{constant}$ is an integral of the system. Furthermore, since the objective is to minimize
time, this constant is 0. We have five equations to solve for the unknowns $x(t)$, $\lambda(t)$ for $t \in [0, t_f]$, and for $t_f$.


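Following [Bryson and Ho, 1975], we can simplify the two-point boundary value problem by solving for $\dot{\theta}$ to obtain

$$ \dot{\theta} = \sin^2\theta \, \frac{\partial v_2}{\partial x_1} + \sin\theta\cos\theta \left( \frac{\partial v_1}{\partial x_1} - \frac{\partial v_2}{\partial x_2} \right) - \cos^2\theta \, \frac{\partial v_1}{\partial x_2} \tag{4} $$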


Equations (4) and (2) are the necessary conditions satisfied by this new, reduced-order two-point boundary value
problem. The four boundary conditions are that $x(t_0)$ and $x(t_f)$ are specified. They are used to solve for $(x(t), \theta(t))$
from $t_0$ to $t_f$ and for $t_f$ itself. Note that if $v(x)$ were constant, then $\theta$ would be constant; in other words, the
minimum-time paths would be straight lines. If $v(x)$ varies, it is possible for some of the optimal trajectories to intersect at
conjugate points $x_c$. For these trajectories to be considered optimal solutions, the control $u^*(t) = \theta^*(t)$ must satisfy
the following sufficient condition:

$$ \frac{\partial^2 H}{\partial\theta^2}\bigl(x^*(t), \theta^*(t), \lambda^*(t)\bigr) = \frac{V}{V + v_1\cos\theta + v_2\sin\theta} \tag{5} $$

which is clearly positive definite if

$$ V > -(v_1\cos\theta + v_2\sin\theta) \quad \text{or if} \quad V > \|v\| = \sqrt{v_1^2 + v_2^2} \tag{6} $$

For further discussion of Zermelo’s problem, see [Bryson and Ho, 1975].
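The right-hand sides of Eqs. (2) and (4), together with the second-derivative expression of Eq. (5), can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the callables `current` and `jacobian` are hypothetical and do not appear in the paper.

```python
import math

def zermelo_rhs(x1, x2, theta, V, current, jacobian):
    """Right-hand sides of Eqs. (2) and (4).

    current(x1, x2)  -> (v1, v2), the current velocity at the ship position.
    jacobian(x1, x2) -> ((dv1/dx1, dv1/dx2), (dv2/dx1, dv2/dx2)).
    Both callables are hypothetical helpers supplied by the user.
    """
    v1, v2 = current(x1, x2)
    (dv1_dx1, dv1_dx2), (dv2_dx1, dv2_dx2) = jacobian(x1, x2)
    s, c = math.sin(theta), math.cos(theta)
    x1_dot = v1 + V * c                      # Eq. (2), first component
    x2_dot = v2 + V * s                      # Eq. (2), second component
    theta_dot = (s * s * dv2_dx1             # Eq. (4)
                 + s * c * (dv1_dx1 - dv2_dx2)
                 - c * c * dv1_dx2)
    return x1_dot, x2_dot, theta_dot

def hamiltonian_curvature(x1, x2, theta, V, current):
    """Second derivative of H with respect to theta along an extremal, Eq. (5)."""
    v1, v2 = current(x1, x2)
    return V / (V + v1 * math.cos(theta) + v2 * math.sin(theta))
```

With a constant current the Jacobian vanishes, so `theta_dot` is zero and the heading stays constant, which matches the straight-line paths noted in the text.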


3. Approximation using the Sugeno Controller 

A generalized Sugeno-type controller [Buckley, 1993] is a fuzzy engine mapping a vector $x = [x^1, x^2, \ldots, x^n]^T \in \mathbb{R}^n$
into $u \in \mathbb{R}$, where $x$ is interpreted as a state vector or a measurement and $u$ as a control action. The inference is
of the form:

$$ R_k: \ \text{IF } x^1 \text{ is } A_k^1 \text{ and } x^2 \text{ is } A_k^2 \text{ and } \ldots \text{ and } x^n \text{ is } A_k^n \text{ THEN } u = y_k = P_k(x) \tag{7} $$

where $x^i$ is the $i$-th component of the input vector and is a crisp value, $A_k^i$ specifies which among the fuzzy attributes
of $x^i$ is tested by rule $k$, and $P_k$ is a polynomial in $x^1, x^2, \ldots, x^n$ assigned to $u$ by the $k$-th rule. The rules of the
original Sugeno controller (OSC) have the following form:

$$ R_k: \ \text{IF } x^1 \text{ is } A_k^1 \text{ and } x^2 \text{ is } A_k^2 \text{ and } \ldots \text{ and } x^n \text{ is } A_k^n \text{ THEN } u = y_k = c_k^0 + c_k^1 x^1 + \cdots + c_k^n x^n \tag{8} $$

where $c_k^0, c_k^1, \ldots, c_k^n$ are the consequence coefficients of the $k$-th fuzzy rule. For further discussion of Sugeno-type
controllers, see [Buckley, 1993]. Buckley proved that a Sugeno-type controller can approximate any continuous real-valued
function in the output space to any degree of accuracy if: (a) the input fuzzy sets have continuous membership
functions and (b) a continuous T-norm is used in the rule evaluation process. This is the universal approximation
property of the Sugeno-type controller.

Next, we consider approximating the trajectories of the optimal feedback control law by the original Sugeno
controller. To do so, we need to determine the coefficients $c_k^0$, $c_k^1$, etc. In this paper, we use subscripts to index vectors
and superscripts to identify components within a vector. In general, the output $u$ for the inputs $x^1, \ldots, x^n$ is obtained by
the centroid method of defuzzification.
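As a rough illustration of how an OSC of the form (8) produces its output, the following sketch evaluates all rules and combines them by a weighted average. Several choices here are assumptions made only for the example and are not stated in the paper: Gaussian membership functions, the product T-norm, normalization of the truth values, and the names `rule_strengths` and `osc_output`.

```python
import itertools
import numpy as np

def gaussian(x, center, width):
    """Continuous membership function (an illustrative choice)."""
    return float(np.exp(-0.5 * ((x - center) / width) ** 2))

def rule_strengths(x, centers, width):
    """Normalized truth values of all rules at the input x.

    centers[i] lists the attribute centers of input component i. One rule is
    formed per combination of attributes, in itertools.product order, and the
    product T-norm combines the membership grades (assumption).
    """
    grades = [[gaussian(xi, c, width) for c in centers[i]]
              for i, xi in enumerate(x)]
    w = np.array([np.prod(combo) for combo in itertools.product(*grades)])
    return w / w.sum()

def osc_output(x, centers, width, coeffs):
    """Weighted average of the linear rule consequents of Eq. (8).

    coeffs has shape (K, n + 1); row k holds [c_k^0, c_k^1, ..., c_k^n].
    """
    beta = rule_strengths(x, centers, width)
    X = np.concatenate(([1.0], x))           # [1, x^1, ..., x^n]
    return float(beta @ (coeffs @ X))
```

If every rule carries the same consequent, the output reduces to that single linear function regardless of the membership grades, which is a convenient sanity check.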

Let $(x_j, u_j)$ be the $j$-th training input/output pair out of a total of $J$ pairs. In this paper, these training data are obtained
from the generated optimal trajectories. The consequence parameters can then be obtained by solving a recursive least
squares parameter identification problem [Takagi and Sugeno, 1985], where we determine the unknown coefficients by
minimizing the error index

$$ \min J = \sum_{j=1}^{J} (\hat{u}_j - u_j)^2 \tag{9} $$

where $u_j$ is the output of the optimal feedback control law and $\hat{u}_j$ is the output of the Sugeno controller. The necessary
condition satisfied by the solution is $ZC = U$, where


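$$ C = \begin{bmatrix} c_1 \\ \vdots \\ c_K \end{bmatrix}, \quad c_k = \begin{bmatrix} c_k^0 \\ \vdots \\ c_k^n \end{bmatrix}, \quad U = \begin{bmatrix} u_1 \\ \vdots \\ u_J \end{bmatrix}, \quad Z = \begin{bmatrix} \beta_1 \otimes X_1 \\ \beta_2 \otimes X_2 \\ \vdots \\ \beta_J \otimes X_J \end{bmatrix} \tag{10} $$

where $Z$ is a $J \times K(n+1)$ matrix, $X_j = [1, x_j^1, \ldots, x_j^n]$, and

$$ \beta_j^k = \beta^k(x_j), \quad \text{and} \quad \beta_j = \beta(x_j) \tag{11} $$

represent the truth values of the rules evaluated at the vector $x_j$. The least squares solution for $C$ can be calculated
recursively by using the following procedure [Takagi and Sugeno, 1985; Ljung and Soderstrom, 1986]. Denote the
$j$-th row vector of the matrix $Z$ defined in (10) by $z_j$ and the $j$-th row of $U$ by $u_j$. Then $C$ can be computed using the
iteration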


$$ C^{(j+1)} = C^{(j)} + S^{(j+1)} z_{j+1}^T \left( u_{j+1} - z_{j+1} C^{(j)} \right) \tag{12} $$

$$ S^{(j+1)} = S^{(j)} - \frac{S^{(j)} z_{j+1}^T z_{j+1} S^{(j)}}{1 + z_{j+1} S^{(j)} z_{j+1}^T} \tag{13} $$

where $S^{(j)}$ is a square ($K(n+1) \times K(n+1)$) covariance matrix at the $j$-th iteration (i.e., after the $j$-th training pair has
been acquired and used), and $C^{(j)}$ is the corresponding coefficient vector. Then $C^{(J)}$ at the final iteration is the least
squares solution. The initial estimates, $C^{(0)}$ and $S^{(0)}$, are chosen as $C^{(0)} = 0$ and $S^{(0)} = \alpha I$, where $\alpha$ is a large
number and $I$ is the identity matrix.
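The recursion of Eqs. (12) and (13) can be sketched with NumPy as follows. The function name `recursive_least_squares` and the default value of `alpha` are assumptions made for this example; the paper only states that the initial covariance is a large multiple of the identity.

```python
import numpy as np

def recursive_least_squares(Z, U, alpha=1e6):
    """Recursive least-squares estimate of C from the rows of Z and U.

    Z : (J, M) array whose j-th row is the regressor z_j.
    U : (J,) array of target outputs u_j.
    alpha : large positive scalar for the initial covariance (assumed value).
    """
    J, M = Z.shape
    C = np.zeros(M)                  # C^(0) = 0
    S = alpha * np.eye(M)            # S^(0) = alpha * I
    for j in range(J):
        z = Z[j]
        Sz = S @ z                   # S is symmetric, so S z^T z S = outer(Sz, Sz)
        S = S - np.outer(Sz, Sz) / (1.0 + z @ Sz)   # Eq. (13)
        C = C + (S @ z) * (U[j] - z @ C)            # Eq. (12), with the updated S
    return C
```

For a well-conditioned `Z` and a large `alpha`, the result should agree closely with a batch solver such as `numpy.linalg.lstsq`; the recursive form is useful when the training pairs arrive one at a time.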

Note that if rule $k$ never fires (i.e., $\beta_j^k = 0$ for all $j$), then $Z$ is not full rank and $ZC = U$ has no unique
least-squares solution. Hence, if a rule never fires for the given training data, this rule should be eliminated to make the
solution of the least squares problem unique. Moreover, such a rule will not be applicable or relevant for any trajectory
similar to the training data.
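The effect of a rule that never fires can be seen directly on the regressor matrix. The sketch below assembles $Z$ row by row with a Kronecker product, as in the definition of $Z$, and drops rules whose truth value is zero on every training sample. The helper names `build_regressor` and `drop_silent_rules` are hypothetical, as is the tolerance argument.

```python
import numpy as np

def build_regressor(beta, X):
    """Stack the rows kron(beta_j, X_j) into the matrix Z.

    beta : (J, K) truth values of the K rules at each training sample.
    X    : (J, n + 1) rows [1, x^1, ..., x^n] of each training sample.
    """
    return np.stack([np.kron(b, x) for b, x in zip(beta, X)])

def drop_silent_rules(beta, tol=0.0):
    """Remove rules whose truth value never exceeds tol on the training data.

    Returns the reduced truth-value matrix and the indices of the kept rules.
    """
    active = np.flatnonzero(np.abs(beta).max(axis=0) > tol)
    return beta[:, active], active
```

A rule with all-zero truth values contributes $n + 1$ zero columns to $Z$, so removing it restores full column rank whenever the remaining columns are independent.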

When the generalized Sugeno controller is used, the above procedure remains largely unchanged, except that $X$
now becomes, for the case of $n = 2$,

$$ X = [1, x^1, x^2, (x^1)^2, x^1 x^2, (x^2)^2, (x^1)^3, (x^1)^2 x^2, x^1 (x^2)^2, (x^2)^3, \ldots] \tag{14} $$

and the definition of $c_k$ is to be updated accordingly so that $y_k$ defined in Eq. (7) can be expressed in the form $y_k = X c_k$.
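The polynomial regressor of Eq. (14) lists the monomials of the input components in order of increasing total degree. A minimal sketch of one way to generate it is given below; the function name `polynomial_regressor` and the truncation argument `degree` are assumptions for this example.

```python
from itertools import combinations_with_replacement
import numpy as np

def polynomial_regressor(x, degree):
    """Monomials of the components of x up to the given total degree.

    For two inputs the ordering is 1, a, b, a^2, ab, b^2, a^3, a^2 b, ...,
    where a and b denote the first and second components of x.
    """
    terms = [1.0]
    for d in range(1, degree + 1):
        # Each non-decreasing index tuple selects one monomial of degree d.
        for idx in combinations_with_replacement(range(len(x)), d):
            terms.append(float(np.prod([x[i] for i in idx])))
    return np.array(terms)
```

With this vector in place of $[1, x^1, \ldots, x^n]$, each rule has one coefficient per monomial and the least-squares machinery is otherwise unchanged.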


4. Methodology and Procedure 

This section proposes a technique to approximate a feedback implementation of optimal controls. It uses the data
generated from the optimal control law to identify the coefficients of the generalized Sugeno controller. Here, we do
not need to solve the two-point boundary value problem for an arbitrary but given $x(t_f)$; we only need to generate a family
of optimal solutions $x(t)$ in which each optimal trajectory reaches the final docking position $x_f$ at some final steering
angle $\theta_f$. To generate one such trajectory, we integrate Equations (4) and (2) backwards in time from $x(t_f) = x_f$
(the docking position) and $\theta(t_f) = \theta_f$, for any desired $t_f$, until $t = 0$. Optimal solutions are generated for two cases
of $v_1(x)$ and $v_2(x)$.

We consider the simple case where the current velocity varies linearly. The objective is to find the minimum-time
path from a certain point $x_0$ to a docking position at the origin. The velocity components of the currents are the
following:

$$ v_1(x) = -\frac{V}{h} x_2, \quad v_2(x) = 0 \tag{15} $$

We generated 18 trajectories for $t_f = 0.6$ seconds. The trajectories, along with their time and $\theta$ contours, are shown in
Figures 1-2. The magnitude of the ship’s velocity relative to the water, $V$, is chosen to be 10 and $h$, a constant parameter,
is chosen to equal 2. The optimal solutions for this case can be obtained in closed form [Bryson and Ho, 1975], but
our figures are generated by backward integration.
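A backward integration of this kind can be sketched as follows for the linear current with $V = 10$ and $h = 2$. For this current, Eq. (4) reduces to $\dot{\theta} = (V/h)\cos^2\theta$, so that $\tan\theta(t) = \tan\theta_f - (V/h)(t_f - t)$, which gives a simple check on the numerical result. The fixed-step Runge-Kutta scheme, the step count, and the function name `backward_trajectory` are assumptions of this sketch; the paper does not state which integrator was used.

```python
import math

def backward_trajectory(theta_f, t_f, V=10.0, h=2.0, steps=600):
    """Integrate Eqs. (2) and (4) backwards from the origin for the linear current.

    Starts at x(t_f) = (0, 0) with heading theta_f and steps back to t = 0 with
    classical fourth-order Runge-Kutta. Returns a list of (t, x1, x2, theta).
    """
    def rhs(s):
        x1, x2, th = s
        return (-(V / h) * x2 + V * math.cos(th),   # Eq. (2) with v1 = -(V/h) x2
                V * math.sin(th),                   # Eq. (2) with v2 = 0
                (V / h) * math.cos(th) ** 2)        # Eq. (4) for this current

    dt = -t_f / steps                               # negative step: reverse time
    s, t = (0.0, 0.0, theta_f), t_f
    out = [(t, *s)]
    for _ in range(steps):
        k1 = rhs(s)
        k2 = rhs(tuple(a + 0.5 * dt * b for a, b in zip(s, k1)))
        k3 = rhs(tuple(a + 0.5 * dt * b for a, b in zip(s, k2)))
        k4 = rhs(tuple(a + dt * b for a, b in zip(s, k3)))
        s = tuple(a + dt / 6.0 * (p + 2.0 * q + 2.0 * r + w)
                  for a, p, q, r, w in zip(s, k1, k2, k3, k4))
        t += dt
        out.append((t, *s))
    return out
```

Calling the function once per final heading, for example over 18 values of `theta_f`, yields a family of trajectories; the particular spacing of the final headings is not given in the paper and would have to be chosen by the user.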

The map $\theta = \theta(x)$ modulo $2\pi$ has a discontinuity due to the modulo operation. For example, a training trajectory
can start with an initial heading angle of 330°, with the angle increasing gradually until it reaches 360° (at which point
$\theta$ becomes 0°), and end at a final heading angle of 10°. The discontinuity occurs at the transition point 360°/0°.
Sugeno engines encounter difficulties in approximating discontinuous maps.

To eliminate this problem, we use two Sugeno engines to approximate the sine and cosine of the heading angle as
functions of the state and then combine them after the approximation. Hence, we approximate $u_1(x) = \cos\theta$
and $u_2(x) = \sin\theta$ using two Sugeno engines and then combine the two approximations using the formula
$\theta = \arctan(u_2/u_1)$ for use in Eq. (2).
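The encoding and recombination of the heading angle can be illustrated with a short sketch. It uses the two-argument arctangent `math.atan2` to recover the angle in the correct quadrant; this is an implementation choice assumed here, since the paper only writes the arctangent of the ratio. The function names are hypothetical.

```python
import math

def encode_heading(theta):
    """Training targets (cos theta, sin theta); continuous across 360 deg / 0 deg."""
    return math.cos(theta), math.sin(theta)

def heading_from_components(u1, u2):
    """Recover the heading from u1 ~ cos(theta) and u2 ~ sin(theta).

    atan2 resolves the quadrant and returns a value in (-pi, pi]. A common
    positive scaling of (u1, u2) does not change the result, so the two
    approximations need not lie exactly on the unit circle.
    """
    return math.atan2(u2, u1)
```

Headings of 359° and 1° are far apart as angles but nearly identical as (cosine, sine) pairs, which is why the two engines see a continuous target.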

5. Simulation and Results 

Now we summarize the procedure followed in this paper and show and discuss the results. 

1. Generate the training data.

- Starting from the docking position $x_f$ and a large $t_f$, integrate Eqs. (2) and (4) backwards in time from the final
conditions $x(t_f) = x_f$ and $\theta(t_f) = \theta_f$ until $t = 0$. This will generate one extremal (actually optimal)
trajectory for every choice of $\theta_f$.

- During each integration, record the state $x(t)$ and the control $\theta(t)$ as the input and output training data. The
optimal time-to-go for that state is $t_f - t$.

- Generate a set of trajectories by choosing a set of final values $\theta_f$ that is fine enough to cover the regions
of interest in the state space with enough optimal trajectories. Figures 1 and 2 show the generated optimal
trajectories for Case 1 along with the associated time and control contours.

2. Perform the least-squares recursion to obtain the consequence coefficients $C$. There are two sets of coefficients,
one for the sine and one for the cosine.

3. Generate the testing data set. This is achieved by choosing a set of initial conditions $x(0)$. We considered two
testing sets:

- One set was generated by taking the values of the state at the end of the backward integration conducted in
step 1 above. We refer to this as Testing Set 1. If the approximation were perfect, the feedback control law would
regenerate the optimal trajectories of step 1.

- Another set was generated more or less randomly near the edge of the region of interest.

4. Simulate the feedback-controlled ship motion.

5. Consider the two performance measures:

(a) The approximation error index $J$ defined in Eq. (9).

(b) How closely the trajectories of the feedback-controlled ship match the trajectories of the optimally controlled
ship.

We used five attributes for each input variable. Therefore, there are 25 possible rules, but six of these rules could be
eliminated. The original Sugeno controller yielded a very good approximation. The optimal trajectories and those
generated from the approximate feedback law (using Testing Set 1) are shown in Figure 3. The average error index for
the OSC is 7.2901.
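Steps 4 and 5 of the procedure can be sketched as a closed-loop simulation plus the two performance measures. The sketch below assumes the linear current of Eq. (15), a forward-Euler integrator, and a user-supplied feedback law `policy(x1, x2)` returning the heading; all three are assumptions of the example, as are the function names. It does not reproduce the error value reported above.

```python
import math

def simulate_closed_loop(policy, x0, t_f, V=10.0, h=2.0, steps=1000):
    """Forward-Euler simulation of Eq. (2) under the feedback law theta = policy(x1, x2).

    Uses the linear current v1 = -(V/h) x2, v2 = 0. Returns the list of positions.
    """
    dt = t_f / steps
    x1, x2 = x0
    path = [(x1, x2)]
    for _ in range(steps):
        theta = policy(x1, x2)
        # Both components are updated from the state at the start of the step.
        x1, x2 = (x1 + dt * (-(V / h) * x2 + V * math.cos(theta)),
                  x2 + dt * (V * math.sin(theta)))
        path.append((x1, x2))
    return path

def error_index(u_opt, u_hat):
    """Sum of squared control errors, as in Eq. (9)."""
    return sum((a - b) ** 2 for a, b in zip(u_opt, u_hat))

def max_deviation(path_a, path_b):
    """Largest pointwise distance between two paths sampled at the same instants."""
    return max(math.hypot(a1 - b1, a2 - b2)
               for (a1, a2), (b1, b2) in zip(path_a, path_b))
```

Here `policy` would be the trained pair of Sugeno engines combined through the arctangent, and `max_deviation` is one possible way to quantify measure (b) when the optimal and closed-loop paths are sampled at the same time instants.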






Figure 1: Optimal trajectories with equal-time contours. 


6. Conclusions 

Sugeno approximation and learning-from-example were shown to yield a powerful and easy-to-use method to
implement optimal control in feedback. Since the lack of a readily available feedback implementation is a major
limitation of optimal control, this new method is promising and encouraging.

In our approach, optimal control theory is used to generate a set of optimal state and control trajectories usually 
by backward integration, thus alleviating if not eliminating the need to solve two-point boundary value problems. 
Next, Sugeno Fuzzy Engines are taught to abstract and approximate the state-to-control mapping from these example 
trajectories. The original Sugeno engine was used to implement in feedback Zermelo’s optimal control. 
Figure 2: Optimal trajectories with equal-angle contours.



Figure 3: Trajectories generated by the optimal and approximate control laws.